21 research outputs found

    Bioinformatics services for analyzing massive genomic datasets

    The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/. © 2020, Korea Genome Organization

    MONGKIE: an integrated tool for network analysis and visualization for multi-omics data

    Background: Network-based integrative analysis is a powerful technique for extracting biological insights from multilayered omics data such as somatic mutations, copy number variations, and gene expression data. However, integrated analysis of multi-omics data is quite complicated and can hardly be done in an automated way. Thus, a powerful interactive visual mining tool supporting diverse analysis algorithms for identification of driver genes and regulatory modules is much needed. Results: Here, we present a software platform that seamlessly integrates network visualization with omics data analysis tools. The visualization unit supports various options for displaying multi-omics data as well as unique network models for describing sophisticated biological networks such as complex biomolecular reactions. In addition, we implemented diverse in-house algorithms for network analysis, including network clustering and over-representation analysis. Novel functions include facile definition and optimized visualization of subgroups, comparison of a series of data sets in an identical network by data-to-visual mapping and a subsequent overlaying function, and management of custom interaction networks. The utility of MONGKIE for network-based visual data mining of multi-omics data was demonstrated by analysis of the TCGA glioblastoma data. MONGKIE was developed in Java based on the NetBeans plugin architecture, and is thus OS-independent with intrinsic support for module extension by third-party developers. Conclusion: We believe that MONGKIE would be a valuable addition to network analysis software by supporting many unique features and visualization options, especially for analysing multi-omics data sets in cancer and other diseases. Reviewers: This article was reviewed by Prof. Limsoon Wong, Prof. Soojin Yi, and Maciej M Kańduła (nominated by Prof. David P Kreil).

    Fluid Shear Stress Regulates the Landscape of microRNAs in Endothelial Cell-Derived Small Extracellular Vesicles and Modulates the Function of Endothelial Cells

    Blood fluid shear stress (FSS) modulates endothelial function and vascular pathophysiology. Small extracellular vesicles (sEVs) such as exosomes are potent mediators of intercellular communication, and their contents reflect cellular stress. Here, we explored the miRNA profiles in endothelial cell (EC)-derived sEVs (EC-sEVs) under atheroprotective laminar shear stress (LSS) and atheroprone low-oscillatory shear stress (OSS) and conducted a network analysis to identify the main biological processes modulated by the sEVs' miRNAs. The EC-sEVs were collected from the culture media of human umbilical vein endothelial cells exposed to atheroprotective LSS (20 dyne/cm²) and atheroprone OSS (±5 dyne/cm²). We explored the miRNA profiles in FSS-induced EC-sEVs (LSS-sEVs and OSS-sEVs) and conducted a network analysis to identify the main biological processes modulated by the sEVs' miRNAs. In vivo studies were performed in a mouse model of partial carotid ligation. The genes targeted by the sEVs' miRNAs were enriched for endothelial activation processes such as angiogenesis, cell migration, and vascular inflammation. OSS-sEVs promoted tube formation, cell migration, monocyte adhesion, and apoptosis, and upregulated the expression of proteins that stimulate these biological processes. FSS-induced EC-sEVs had the same effects on endothelial mechanotransduction signaling as direct stimulation by FSS. In vivo studies showed that LSS-sEVs reduced the expression of pro-inflammatory genes, whereas OSS-sEVs had the opposite effect. By characterizing the landscape of EC-exosomal miRNAs regulated by differential FSS patterns, this research establishes their biological functions at a system level and provides a platform for modulating the overall phenotypic effects of sEVs.

    Additional file 1: of MONGKIE: an integrated tool for network analysis and visualization for multi-omics data

    Supplementary text, figures, and data files. All text and materials were formatted as a small self-contained website (1 HTML file with the necessary figures and data files). Data files include input and result files of the case study, including the fold change of expression values between tumor and normal conditions (in log2FC), the average expression value of each gene in 4 GBM subtypes, GBM-altered subnetworks (nodes and edges) weighted by expression correlations between each pair of genes, and gene sets in 2 critical modules and their functional annotations. (ZIP, 4315 KB)

    Human cytomegalovirus induces and exploits Roquin to counteract the IRF1-mediated antiviral state

    RNA represents a pivotal component of host–pathogen interactions. Human cytomegalovirus (HCMV) infection causes extensive alteration in host RNA metabolism, but the functional relationship between the virus and cellular RNA processing remains largely unknown. Through loss-of-function screening, we show that HCMV requires multiple RNA-processing machineries for efficient viral lytic production. In particular, the cellular RNA-binding protein Roquin, whose expression is actively stimulated by HCMV, plays an essential role in inhibiting the innate immune response. Transcriptome profiling revealed Roquin-dependent global downregulation of proinflammatory cytokines and antiviral genes in HCMV-infected cells. Furthermore, using cross-linking immunoprecipitation (CLIP)-sequencing (seq), we identified IFN regulatory factor 1 (IRF1), a master transcriptional activator of immune responses, as a Roquin target gene. Roquin reduces IRF1 expression by directly binding to its mRNA, thereby enabling suppression of a variety of antiviral genes. This study demonstrates how HCMV exploits a host RNA-binding protein to prevent a cellular antiviral response and offers mechanistic insight into the potential development of CMV therapeutics. © 2019 National Academy of Sciences.

    Identification of tumor suppressor miRNAs by integrative miRNA and mRNA sequencing of matched tumor–normal samples in lung adenocarcinoma

    © 2019 The Authors. Published by FEBS Press and John Wiley & Sons Ltd. The roles of miRNAs in lung cancer have not yet been explored systematically at the genome scale despite their important regulatory functions. Here, we report an integrative analysis of miRNA and mRNA sequencing data for matched tumor–normal samples from 109 Korean female patients with non-small-cell lung adenocarcinoma (LUAD). We produced miRNA sequencing (miRNA-Seq) and RNA-Seq data for 48 patients and RNA-Seq data for 61 additional patients. Subsequent differential expression analysis with stringent criteria yielded 44 miRNAs and 2322 genes. Integrative gene set analysis of the differentially expressed miRNAs and genes using miRNA–target information revealed several regulatory processes related to the cell cycle that were targeted by tumor suppressor miRNAs (TSmiRs). We performed colony formation assays in A549 and NCI-H460 cell lines to test the tumor-suppressive activity of miRNAs downregulated in cancer and identified 7 novel TSmiRs (miR-144-5p, miR-218-1-3p, miR-223-3p, miR-27a-5p, miR-30a-3p, miR-30c-2-3p, miR-338-5p). Two miRNAs, miR-30a-3p and miR-30c-2-3p, showed differential survival characteristics in The Cancer Genome Atlas (TCGA) LUAD patient cohort, indicating their prognostic value. Finally, we identified a network cluster of miRNAs and target genes that could be responsible for cell cycle regulation. Our study not only provides a dataset of miRNA and mRNA sequencing from matched tumor–normal samples, but also reports several novel TSmiRs that could potentially be developed into prognostic biomarkers or therapeutic RNA drugs.

    Splicing signature database development to delineate cancer pathways using literature mining and transcriptome machine learning

    Alternative splicing (AS) events modulate certain pathways and phenotypic plasticity in cancer. Although previous studies have computationally analyzed splicing events, it is still a challenge to uncover the biological functions induced by reliable AS events among the very large number of candidates. To provide essential splicing event signatures for assessing pathway regulation, we developed a database by collecting two datasets: (i) the reported literature and (ii) cancer transcriptome profiles. The former includes knowledge-based splicing signatures collected from 63,229 PubMed abstracts using natural language processing, extracted for 202 pathways. The latter comprises machine learning-based splicing signatures identified from the pan-cancer transcriptome for 16 cancer types and 42 pathways. We established six different learning models to classify pathway activities, using splicing profiles as the learning dataset. The AS events ranked highest by each learning model's feature importance became the signature for each pathway. To validate our learning results, we performed evaluations using (i) performance metrics, (ii) differential AS sets acquired from external datasets, and (iii) our knowledge-based signatures. The area under the receiver operating characteristic curve values of the learning models did not differ drastically. However, the random-forest model clearly performed best when compared against the AS sets identified from external datasets and our knowledge-based signatures. Therefore, we used the signatures obtained from the random-forest model. Our database provides the clinical characteristics of the AS signatures, including survival tests, molecular subtypes, and tumor microenvironment. The regulation by splicing factors was additionally investigated. Our database of developed signatures supports a retrieval and visualization system.